Abstract

Polymerases ε and δ are the main enzymes that replicate eukaryotic DNA. Accurate replication occurs through
Watson-Crick base pairing and also through the action of the polymerases' exonuclease (proofreading) domains. 
We have recently shown that germline exonuclease domain mutations (EDMs) of POLE and POLD1 confer a high
risk of multiple colorectal adenomas and carcinoma (CRC). POLD1 mutations also predispose to endometrial cancer
(EC). These mutations are associated with high penetrance and dominant inheritance, although the phenotype 
can be variable. We have named the condition polymerase proofreading-associated polyposis (PPAP). Somatic 
POLE EDMs have also been found in sporadic CRCs and ECs, although very few somatic POLD1 EDMs have
been detected. Both the germline and the somatic DNA polymerase EDMs cause an 'ultramutated', apparently 
microsatellite-stable, type of cancer, sometimes leading to over a million base substitutions per tumour. Here, 
we present the evidence for POLE and POLD1 as important contributors to the pathogenesis of CRC and EC, and
highlight some of the key questions in this emerging field. 
Introduction

In comparison with other cancers, there exist a 
relatively large number of syndromes in which a 
high lifetime risk of colorectal cancer (CRC) is 
caused by inheriting a mutation in a single gene. 
The specific Mendelian CRC syndromes (and their 
mutant genes) are familial adenomatous polyposis 
(APC), Lynch syndrome/HNPCC (mismatch repair 
genes MSH2, MLH1, MSH6, PMS2), Peutz-Jeghers
syndrome (LKB1/STK11), juvenile polyposis (SMAD4,
BMPR1A), MUTYH-associated polyposis (the base
excision repair gene MUTYH), and hereditary mixed
polyposis (GREM1). All of these conditions, except
MUTYH-associated polyposis, are dominantly inherited,
although a recessive version of HNPCC exists,
in which both copies of one of the mismatch repair 
(MMR) genes are mutated. Each syndrome differs in 
its clinical features, but in most cases, there is a primary
predisposition to multiple (tens to thousands of) adenomas
or other benign polyps, with a secondary CRC risk, 
probably owing to progression of the benign lesions. 
The exception is Lynch syndrome, in which there is 
usually a small excess of adenomas and the primary
predisposition is to CRC. 

An ongoing question for several years has been 
whether there are any more high-penetrance CRC 
predisposition genes to be found. There certainly exist 
patients whose clinical features and family history 
make them a priori likely to carry a high-penetrance 
CRC predisposition allele, but who have no mutations 
in the known Mendelian CRC genes. One such group 
of patients comprises individuals with hyperplastic 
polyposis syndrome (HPPS), although the hypothetical 
HPPS gene(s) has not yet been identified. Another 
set of patients likely to carry high-penetrance CRC 
mutations has multiple adenomas. Typically, these 
tumours number 10-100 at presentation or after a few 
years of screening. Patients may present before the 
age of 60 and they have often developed one or more 
CRCs. Some of these individuals come from extensive 
CRC pedigrees, although others have no significant 
family history of colorectal tumours. 

In this article, we summarize the recent identification
of DNA polymerase ε and δ mutations in familial
colorectal cancer cases, many of whom have multiple
adenomas. In common with other genes such as APC
and SMAD4, the polymerases are additionally somatically
mutated in a recently reported subset of sporadic
CRCs [1]. There is also accruing evidence that, like the
mismatch repair genes, POLE and POLD1 mutations
play roles in endometrial carcinogenesis.

High-penetrance germline DNA polymerase ε and
δ mutations cause colorectal and endometrial
cancers

Using a combination of whole-genome sequencing 
of highly-selected multiple adenoma patients, linkage 
analysis, and studies of loss of heterozygosity (LOH) 
in tumours, followed by replication in a large set of 
familial CRC cases, we recently identified two specific 
germline mutations that caused carriers to develop mul- 
tiple colorectal adenomas and CRC [2]. These muta- 
tions were in two related DNA polymerase genes: 
POLE (p.Leu424Val) and POLD1 (p.Ser478Asn). Neither
mutation was present in nearly 7000 UK controls
or in public databases of controls. Subsequently,
we found an additional, probably pathogenic mutation,
POLD1 p.Pro327Leu, in a further multiple adenoma
patient. 

Both mutations show dominant inheritance and con- 
fer high-penetrance predisposition to multiple colorec- 
tal adenomas, large adenomas, early-onset CRC, or 
multiple CRCs (Figure 1). For this reason, we have 
called the disease polymerase proofreading-associated 
polyposis (PPAP). The phenotype varies among carri- 
ers: some have tens of adenomas that do not always 
appear to progress rapidly to cancer, whereas others 
have a small number of large adenomas or carcinomas. 
The histological features of the tumours are unremark- 
able. They are mostly conventional adenomas and car- 
cinomas that occur throughout the large bowel. There is 
currently no evidence that mutation carriers are at risk 
of upper-gastrointestinal tumours, but female carriers 
of POLD1 p.Ser478Asn have a greatly increased risk
of endometrial cancer (EC). Overall, the PPAP phenotype
overlaps with the phenotypes associated with
germline mutations in APC, MUTYH, and the MMR
genes. POLE p.Leu424Val, POLD1 p.Ser478Asn, and
POLD1 p.Pro327Leu all map within the proofreading
(exonuclease) domain of the respective enzyme, suggesting
that deficient proofreading repair during DNA
replication is the cause of our patients' tumours. It is of
note that POLD1 in particular additionally participates
in both MMR and base excision repair. However, the
families' tumours are microsatellite-stable. Although 
the reasons for this are currently unclear, it is possi- 
ble to speculate that the strand slippage that results in 
insertion-deletion mutations does not involve a mis- 
paired, single-strand intermediate that is recognized by 
the polymerase proofreading domain, and, moreover, 
that polymerase proofreading has a minor role in mis- 
match repair. 
Somatic polymerase ε mutations in sporadic
colorectal and endometrial cancers

Separately from the work that discovered germline 
POLE and POLD1 mutations in colorectal cancer
patients, POLE was highlighted as a somatically 
mutated gene in CRC by The Cancer Genome Atlas 
(TCGA) exome sequencing project [1]. A set of cancers
with a very large number of coding mutations
(over 50 per 10⁶ bases), in the absence of MMR
defects or microsatellite instability (MSI), was also
identified, and this set of cancers overlapped with
the POLE-mutant set. Subsequently, it was found that
almost all of the hypermutant, MS-stable cancers had 
POLE exonuclease domain mutations (EDMs) [2,3]. 
The seven POLE EDMs in the TCGA cohort, out of 
a total of 226 CRCs (3%), were all missense changes, 
although the germline p.Leu424Val change was absent. 
Two recurrent changes were found, p.Val411Leu and 
p.Ser459Phe. These data are consistent with those 
from another CRC exome sequencing project [4] that 
found two of 74 (3%) cancers to have acquired POLE 
p.Pro286Arg mutations. Recently released, but unpub- 
lished data from the TCGA have confirmed codons 
286, 411, and 459 as mutation hotspots [5]. Inter- 
estingly, POLE residue 286 is homologous to the 
probably pathogenic germline mutation that we have 
reported at residue 327 in POLD1. However, there
is no good evidence of pathogenic, somatic POLD1
EDMs. 

The TCGA [5] and we ourselves [6] have found 
somatic POLE EDMs in ECs. These occur at a slightly 
higher frequency (~7%) than that in CRCs. As was the 
case for CRCs, POLD1 EDMs were much less common,
seen in just one tumour in each cohort (approximately 
0.5%). The POLE mutation spectra of the ECs showed 
overlap with those of the CRCs, with p.Pro286Arg 
and p.Val411Leu particularly frequent. p.Ser297Phe 
was also found in two ECs. In addition, a somatic 
p.Leu424Val change - the mutation present in the 
germline of CRC cases - was present in two ECs, and 
one cancer possessed a mutation of a residue that forms 
the exonuclease catalytic site (p.Asp275Val) (Figure 2). 
Like CRCs, the POLE-mutant ECs tended to be MSI-negative
and had an ultramutator phenotype. Figure 2
shows the locations of the somatic and germline POLE
and POLD1 mutations within the exonuclease domain.



The normal roles of polymerase ε and δ

POLE and POLD1 are related B family polymerases.
They form the major catalytic and proofreading
subunits of the Polε and Polδ enzyme complexes
that respectively synthesize the leading and lagging 
strands in DNA replication [7,8]. Their proofreading 
(exonuclease) function detects and removes misincor- 
porated bases in the daughter strand through failed 
complementary pairing with the parental strand.

J Pathol 2013; 230: 148-153
www.thejournalofpathology.com

S Briggs and I Tomlinson

[Figure 1 pedigree diagram. Labels recovered from the diagram, in extraction order: 33 yo; CRC41; 10 adenomas (up to 4 cm), 43-65 yo; CRC, 28 yo, 28 adenomas, 28-35 yo; EC, 33 yo, CRC, 53 yo; EC, 52 yo, 28 adenomas and 2 hyperplastic polyps, 52-68 yo; EC, 45 yo, 17 adenomas, 53-65 yo; 2 astrocytomas, 26 yo; 7 adenomas and 2 hyperplastic polyps, 33-45 yo; 1 adenoma, 29-43 yo; 6 adenomas and "multiple" hyperplastic polyps, 29-39 yo.]

Figure 1. Pedigree of a POLD1 p.Ser478Asn family. Shading denotes those affected with multiple (> 5) colorectal adenomas (adenomas)
and/or early-onset colorectal cancer (CRC). In addition, three women developed endometrial carcinoma (EC). + denotes individuals tested
and found to be gene carriers, and - denotes tested non-carriers. Ages denote the time interval over which colorectal polyps developed or
the time at which cancer occurred. Note that one non-gene carrier developed a very small colorectal adenoma by age 43 years and that
one carrier developed two astrocytomas, raising the possibility that POLD1 mutations also predispose to this tumour type.

Figure 2. The structure of POLE and POLD1 demonstrating the
position of key mutations. Conserved exo motifs I-V within the
exonuclease domain are highlighted in blue. Green circles denote
germline mutations; grey circles denote somatic mutations.

The high fidelity of DNA replication is in part due to
very low error rates in dNTP incorporation by the
polymerase (10⁻⁴ to 10⁻⁵) and in part due to proofreading
by the exonuclease domain, which improves
this fidelity approximately 100-fold. POLE and
POLD1 have greatest homology (23% identity, 37%
similarity) over their exonuclease domains (residues
268-471 of POLE and 304-517 of POLD1). Both
genes are ubiquitously expressed and show high 
levels of evolutionary conservation, especially in the 
exonuclease domain. A number of yeast mutants exist
in the homologues of POLE and POLD1, and these
models have shown that mutator phenotypes can also result
from mutations outside the exonuclease domain, in the polymerase domain [9-14]. Other
variants actually have improved DNA repair capacity 
[15], although they also have lower processivity (rate 
of synthesis of the daughter strand). 
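
The fidelity figures quoted above can be combined in a back-of-envelope calculation. The following Python sketch is illustrative only: it takes the incorporation error rate as 10⁻⁴ to 10⁻⁵ per nucleotide, treats the approximately 100-fold proofreading gain as a simple divisor, and uses a hypothetical function name that does not come from any cited work.

```python
def error_rate_with_proofreading(incorporation_error, proofreading_gain=100.0):
    """Per-base error rate after proofreading, treating the gain as a simple divisor."""
    return incorporation_error / proofreading_gain


# Assumed incorporation error rates per nucleotide (illustrative inputs).
for rate in (1e-4, 1e-5):
    corrected = error_rate_with_proofreading(rate)
    print(f"incorporation {rate:.0e} -> after proofreading {corrected:.0e}")
```

Under these assumptions, proofreading would bring the error rate to roughly 10⁻⁶ to 10⁻⁷ per base before any contribution from MMR.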

The Polε and Polδ enzymes are both heterotetramers
in higher eukaryotes. The accessory subunits
(POLE2/3/4 and POLD2/3/4) are involved in reg- 
ulating synthesis and in binding co-factors such as 
PCNA. It is of note that a common polymorphism 
within POLD3 has been found to be associated with a
modestly increased risk of CRC in the general northern
European population [16]. Whether this acts in a
similar way to POLD1 mutations is currently unclear.

POLD1 - and perhaps also POLE - is thought
to play an additional role in new strand synthesis 
as part of the processes of base excision repair and 
MMR. POLE is involved in break-induced replication, 
a form of double-strand break repair in which the 
homologous chromosome is used as a template, 
resulting in copy-neutral LOH. Whether these aspects 
of DNA polymerase function are important for 
tumour development is unknown, as is the explanation 
for the near-absence of somatic POLD1 EDMs.
At the very least, it is striking that defects in at 
least three pathways involved in the repair of base 
pair-level mutations can predispose to colorectal 
tumours. 
How do POLE and POLD1 mutations cause
tumorigenesis?

The most common somatic and germline mutations in 
POLE and POLD1 have been mapped onto a hybrid
structure of yeast DNA polymerase (PDB 3IAY) and T4 polymerase
(Figure 3). The residues equivalent to the two
germline mutations (POLE p.Leu424Val and POLD1
p.Ser478Asn) pack together at the interface between 
two helices that form the base of the exonuclease active 
site. Mutations are predicted to distort the packing of 
the helices; this will in turn be propagated to the active 
site, affecting nuclease activity [2]. The residue equiv- 
alent to the most common somatic POLE mutation at 
amino acid 286 localizes to the DNA binding pocket 
adjacent to the exonuclease active site, with its side 
chain very close to the nascent single-stranded DNA, 
and substitutions at this site are predicted to perturb
the structure of the DNA binding pocket significantly.
Structural analysis also shows that amino acid 297 
interacts with exonuclease catalytic site residue 275, 
and mutation here would probably alter the active site 
conformation. Interestingly, POLE residue 411, whilst 
conserved, is not predicted to interact with DNA or 
catalytic site residues, and the effects on tumorigene- 
sis may be through secondary effects on the binding 
pocket. Although the structural data do not explain 
the recurrent nature of some POLE mutations, they 
strongly suggest that the POLE and POLD1 mutations
impair polymerase proofreading. For most of
these mutations, moreover, studies in model organ- 
isms including T4 bacteriophage, yeast, and mice have 
confirmed these effects [2,10,17]. For example, the 
p.Pro123Leu mutation in the T4 bacteriophage is at
the equivalent residue to human Pro286 and produces 
a strong mutator phenotype [18]. 

Hypermutation is thus a very plausible consequence 
of POLE and POLD1 EDMs. Whether this is
the only tumour-promoting consequence of these muta- 
tions remains unclear. It will also be intriguing to deter- 
mine whether proofreading deficiency has any effect on 
polymerase processivity, since negative effects on this 
function may be selectively deleterious for the cell. 



Pathways of tumorigenesis 

Exome sequencing data from CRCs and ECs with 
somatic POLE EDMs show that the coding regions 
alone of these tumours have acquired a mean of about 
5000 somatic base substitutions [1,5]. All types of base 
substitution are increased in frequency compared with 
cancers without EDMs, and C:G→T:A changes generally
remain the most common. However, the mutation
spectrum is changed, with a particular increase in the
proportion of G:C→T:A and A:T→C:G transversions.
Although there is considerable variation in the
number of mutations among cancers with EDMs, there 
is some evidence that specific mutations have different
effects on the mutation spectrum.

Figure 3. PyMOL-generated image of POLE and POLD1 EDMs on
a composite structure of yeast Polδ (PDB 3IAY) and the ssDNA
component (yellow) of the T4 polymerase complex (PDB 1N0Y).
Mutant amino acids are shown in red (POLE) or blue (POLD1). The
key mutations generally cluster around the active site (D275) close
to the ssDNA, an exception being V411 which lies some distance
away and may act through affecting the positions of other residues
closer to the active site.

For example, cancers
that carry p.Pro286Arg show a much stronger bias 
towards transversions than cancers with p.Val411Leu. 
Interestingly, the exome sequence data show that the 
somatic mutations resulting from deficient exonuclease 
proofreading tend to occur at sites flanked by an A 
base on the positive DNA strand, rather than by T, 
G or C. The causes for this observation are currently 
unknown, although one possibility is the ability of 
mutations to be corrected by the MMR machinery. 
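
The spectrum comparisons described above rest on sorting each base substitution into a class and counting. The Python sketch below shows one common way to do this, assuming each substitution is given as a (reference base, altered base) pair; the function names and the example calls are hypothetical and are not data from any tumour. Changes are reported from the pyrimidine strand, so that G:C→T:A appears as `C>A`, A:T→C:G as `T>G`, and C:G→T:A as `C>T`.

```python
from collections import Counter

PURINES = {"A", "G"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    if ref == alt or ref not in COMPLEMENT or alt not in COMPLEMENT:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
    same_ring = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_ring else "transversion"


def normalise(ref, alt):
    """Express the change from the pyrimidine strand, giving six classes."""
    if ref in PURINES:
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def spectrum(substitutions):
    """Count the six substitution classes and the fraction of transversions."""
    counts = Counter(normalise(r, a) for r, a in substitutions)
    n_tv = sum(1 for r, a in substitutions
               if substitution_class(r, a) == "transversion")
    return counts, n_tv / len(substitutions)


# Made-up calls for illustration only.
example = [("C", "T"), ("G", "A"), ("G", "T"), ("C", "A"), ("A", "C"), ("T", "G")]
counts, tv_fraction = spectrum(example)
print(dict(counts), round(tv_fraction, 2))
```

Comparing the transversion fraction between groups of tumours, for instance those carrying p.Pro286Arg and those carrying p.Val411Leu, would be one simple way to express the bias described in the text.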

POLE and POLD1 are not classical tumour suppressor
genes, as loss-of-function mutations appear
not to be pathogenic. Instead, there is loss of a specific 
function, proofreading, that is unlikely to be achieved 
through protein-truncating mutations. Furthermore, it
is not clear whether 'two hits' at POLE or POLD1
are required for tumourigenesis. In Pole-mutant mice,
a mutator phenotype and increased frequency of 
tumour formation are only seen when Pole mutations 
are homozygous [17]. In humans, some, but not all, 
tumours from patients with germline POLE or POLD1
mutations show LOH, although data on other forms 
of 'second hit' are lacking in these tumours. LOH 
has not been reported in cancers with somatic POLE 
mutations, although a few of these tumours have 
protein-truncating mutations that could act as 'second 
hits'. From a functional perspective, a Polε or Polδ
protein with a heterozygous EDM would probably 
cause an increase in mutation frequency because 50% 
of polymerase activity would be error-prone. However, 
it is conceivable that this error rate is insufficient to
overwhelm the other repair systems such as MMR. 
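
The 50% argument can be made concrete. The sketch below assumes, purely for illustration, that the mutant allele loses proofreading entirely, that both alleles contribute equally to synthesis, and that proofreading normally improves fidelity 100-fold. None of these assumptions is established for the human mutations, and the function name is hypothetical.

```python
def mean_error_rate(base_error, mutant_fraction, proofreading_gain=100.0):
    """Average per-base error rate when a fraction of polymerase molecules lacks proofreading."""
    proficient = base_error / proofreading_gain  # wild-type enzyme
    deficient = base_error                       # proofreading assumed fully lost
    return (1 - mutant_fraction) * proficient + mutant_fraction * deficient


base_error = 1e-5  # assumed incorporation error rate per nucleotide
wild_type = mean_error_rate(base_error, 0.0)
for label, fraction in (("heterozygous", 0.5), ("homozygous", 1.0)):
    fold = mean_error_rate(base_error, fraction) / wild_type
    print(f"{label}: {fold:.1f}-fold increase over wild type")
```

Under these assumptions, a heterozygous EDM raises the error rate entering MMR about 50-fold, half the increase expected for a homozygous mutant. Whether that is enough to saturate MMR is the open question raised in the text.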

A further unanswered question is when POLE 
and POLD1 mutations occur during tumourigenesis.
For colorectal tumours with either somatic or
germline POLE and POLD1 EDMs, the mutation
spectrum of the APC gene shows a deficiency of
frameshift (insertion-deletion) changes compared with
other CRCs. Moreover, some APC mutations that
are generally rare in CRCs are relatively common in
POLE- and POLD1-mutant tumours, an example being
the Arg1114X change. These data suggest that the
POLE and POLD1 mutations occur before APC mutations.
However, at least in sporadic CRCs with POLE
EDMs, the frequency of pathogenic mutations in the 
other known driver genes is low, perhaps suggesting 
that they follow an atypical pathway of tumorigenesis. 



POLE and POLD1 mutations outside the
exonuclease domain 

POLE and POLD1 are both quite large genes, with
protein-coding regions of about 6.6 and 3.3 kb, respec- 
tively. Inevitably, therefore, they will acquire many 
'passenger' mutations in cancers. The causal role 
of EDMs has been deduced as a result of find- 
ing (i) germline mutations in PPAP and (ii) the 
so-called 'ultramutator', MSI-negative phenotype in 
sporadic cancers. No such clues exist in support of a 
pathogenic role for the non-exonuclease domain POLE 
and POLD1 mutations that are found in 3-4% of
CRCs and ECs. In fact, the majority of these non-EDM 
changes occur in MSI-positive cancers and most seem 
likely to be passengers. 
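
The expectation of passenger mutations follows from gene length. As a rough sketch, the Python below assumes that substitutions fall uniformly across coding sequence (which real tumours do not strictly obey) and takes a rate of 50 substitutions per 10⁶ bases purely as an example input for a hypermutated cancer. The coding lengths are the approximate values given in the text, and the function names are hypothetical.

```python
import math


def expected_hits(coding_length_bp, mutations_per_mb):
    """Expected number of substitutions in a coding region under a uniform rate."""
    return coding_length_bp * mutations_per_mb / 1e6


def prob_at_least_one(coding_length_bp, mutations_per_mb):
    """Poisson probability that the region carries one or more substitutions."""
    return 1 - math.exp(-expected_hits(coding_length_bp, mutations_per_mb))


# Approximate coding lengths from the text; the rate is an assumed example value.
for gene, length_bp in (("POLE", 6600), ("POLD1", 3300)):
    hits = expected_hits(length_bp, 50)
    p = prob_at_least_one(length_bp, 50)
    print(f"{gene}: expected {hits:.2f}, P(>=1) = {p:.2f}")
```

Even this crude model indicates that chance coding substitutions in genes of this size are unsurprising in highly mutated tumours, which is consistent with most non-EDM changes being passengers in MSI-positive cancers.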



Future prospects 

Evidently, the identification of germline and somatic 
POLE and POLDl mutations that cause CRC and 
EC is only the first stage in understanding how those 
changes act and, if possible, exploiting them for cancer 
prevention and therapy. Perhaps the immediate priority 
is to determine the full Polε and Polδ mutation spectra,
to determine which mutations are pathogenic, and then 
to understand their effects. The somatic mutation spectrum
of POLE and POLD1 in the common cancers will
be increasingly well described in the coming months 
as further large-scale sequencing programmes come to 
fruition. It will also be important to test POLE as a 
prognostic marker. In the germline, similar mutation 
profiling efforts will undoubtedly be performed, one 
scenario being that there exist a number of germline 
Polε and Polδ variants with different magnitudes of
effect on risk. Given that common polymorphisms have 
been addressed by CRC and EC GWAS, the unchar- 
acterized germline risk variants are likely to be indi- 
vidually uncommon (<5% allele frequency) and some 
may not necessarily be in the exonuclease domains
of POLE and POLD1. It will be necessary to obtain
evidence for their effects using a variety of functional 
assays. A particularly interesting issue is why germline 
POLE or POLD1 mutations can cause cancer, yet only
POLE is somatically mutated. This has a parallel in the 
MMR genes, of which four can be mutated in Lynch 
syndrome, but only MLH1 plays a role somatically. It
may be that POLD1 has an as yet unidentified essential
function that precludes its somatic mutation or that the 
effects of somatic mutation are too weak (or indeed 
too strong) to be effective other than in the germline 
setting. Finally, the discovery of a new type of CRC 
and EC based on POLE mutations rather than the 
established classifiers of MSI and chromosomal insta- 
bility raises the prospect of future similar discoveries, 
leading to an increasingly refined classification of can- 
cer based on DNA analysis, which is potentially more 
robust than analysis of gene or protein expression. 